Summary

We study the limiting behavior of sums of functions of nearest neighbor
distances for an m-dimensional sample. We establish a central limit
theorem and moment bounds for such sums, and an invariance principle for
the empirical process of nearest neighbor distances. As a consequence, we
obtain the asymptotic behavior of a practicable goodness-of-fit test
based on nearest neighbor distances.

1.  Introduction and Background

In many areas, there has been a long-standing need for a multidimensional
goodness-of-fit test that is general, in the sense that the $\chi^2$ and
Kolmogorov-Smirnov tests are general in one dimension, and that is also
practical in a computational sense. Of course, $\chi^2$ is still available
in any number of dimensions, but its usefulness and practicality are
virtually nil in high-dimensional spaces.

Take $X_1,\dots,X_n$ to be n points in m-dimensional Euclidean space
selected independently from a distribution with density f(x). Define the
nearest neighbor distance from $X_i$ as

$$R_{in} = \min_{1 \le j \le n,\ j \ne i} \|X_i - X_j\| .$$

In what follows we suppress the dependence of $R_{in}$ and related
quantities on n unless confusion is likely.

The distance d(x,y) between points does not have to be Euclidean, but we
assume that it is generated by a norm $\|x\|$, i.e. $d(x,y) = \|x-y\|$.
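
The definition of the nearest neighbor distance under a general norm can be sketched numerically. The block below is only an illustration: it assumes the norm is one of the p-norms supported by NumPy, and the function name `nn_distances` and the three sample points are hypothetical choices, not part of the paper.

```python
import numpy as np

def nn_distances(points, p=2.0):
    """Return (R, J): nearest neighbor distance and neighbor index for each row."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]        # all pairwise differences, shape (n, n, m)
    dist = np.linalg.norm(diff, ord=p, axis=-1)      # ||X_i - X_j|| in the chosen p-norm
    np.fill_diagonal(dist, np.inf)                   # a point is not its own neighbor
    J = dist.argmin(axis=1)                          # index of the nearest neighbor
    R = dist[np.arange(len(pts)), J]                 # nearest neighbor distance
    return R, J

pts = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 0.0]])
R_euclid, J_euclid = nn_distances(pts, p=2)
R_sup, _ = nn_distances(pts, p=np.inf)
R_l1, J_l1 = nn_distances(pts, p=1)
```

In this small example the nearest neighbor of the first point is the second point under the Euclidean norm but the third point under the $\ell_1$ norm, so both the distances and the neighbor indices depend on the norm chosen.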

This paper started with the attempt to derive the limiting distribution
of a goodness-of-fit test for multidimensional densities based on the
nearest neighbor distances. We established a form of the invariance
principle. Our work had two main byproducts: a central limit theorem for
sums of functions of nearest neighbor distances and fourth-order moment
bounds. These two pieces were then put together to get the invariance result.


The goodness of fit test:

In looking for a practical goodness-of-fit test applicable to densities
in an arbitrary number of dimensions, our starting point was the observation,
essentially contained in the work by Loftsgaarden and Quesenberry (1965),
that the variables

$$U_{jn} = \exp\Bigl[-n \int_{\|x - X_j\| < R_j} f(x)\,dx\Bigr], \qquad j = 1,\dots,n,$$

where f(x) is the underlying density, $X_1,\dots,X_n$ are n points sampled
independently from f(x) and $R_j$ is the distance from $X_j$ to its nearest
neighbor, have a univariate distribution that, in any norm $\|\cdot\|$ distance,
a) does not depend on f(x);
b) is approximately uniform.

The reasoning is simple: let S(x,r) be the sphere with center at x and
radius r. For any Borel set A, denote

$$F(A) = \int_A f(y)\,dy .$$

Assume $X_1$ is the first point selected, then the other n-1. The set
$\{R_1 \ge r_1\}$ is equal to the event that none of $X_2,\dots,X_n$ fall in
the interior of the sphere of radius $r_1$ about $X_1$. Hence

$$P(R_1 \ge r_1 \mid X_1 = x_1) = [1 - F(S(x_1,r_1))]^{n-1} .$$

Since for fixed x, F(S(x,r)) is monotonically nondecreasing in r, write
the above as

$$P[F(S(x_1,R_1)) \ge F(S(x_1,r_1)) \mid X_1 = x_1] = [1 - F(S(x_1,r_1))]^{n-1} .$$

Substituting $z = F(S(x_1,r_1))$ gives

$$P[F(S(x_1,R_1)) \ge z \mid X_1 = x_1] = (1-z)^{n-1} \tag{1.1}$$

so that

$$P[F(S(X_1,R_1)) \ge z] = (1-z)^{n-1} .$$
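
Relation (1.1) can be checked by simulation. The sketch below is an illustration under stated assumptions rather than anything taken from the paper: it uses the uniform density on [0, 1] with m = 1, where $F(S(x,r))$ is the length of $[x-r, x+r]$ clipped to [0, 1], and the helper name `ball_probabilities_uniform01`, the seed and the sample sizes are arbitrary choices. By (1.1), $F(S(X_1,R_1))$ should have mean $1/n$ and second moment $2/(n(n+1))$, and $\exp[-nF(S(X_1,R_1))]$ should be roughly uniform.

```python
import numpy as np

def ball_probabilities_uniform01(x):
    """F(S(X_j, R_j)) for a sample x from the uniform density on [0, 1] (m = 1)."""
    xs = np.sort(np.asarray(x, dtype=float))
    gaps = np.diff(xs)
    # nearest neighbor distance: the smaller of the gaps to the left and to the right
    r = np.minimum(np.concatenate(([np.inf], gaps)), np.concatenate((gaps, [np.inf])))
    # probability mass of the interval [x - r, x + r] clipped to the support [0, 1]
    return np.minimum(xs + r, 1.0) - np.maximum(xs - r, 0.0)

rng = np.random.default_rng(0)
n, reps = 50, 2000
p = np.concatenate([ball_probabilities_uniform01(rng.random(n)) for _ in range(reps)])
u = np.exp(-n * p)               # the transformed variables, approximately uniform

mean_p = p.mean()                # (1.1) implies E F(S_1) = 1/n
mean_p2 = (p ** 2).mean()        # and E F^2(S_1) = 2 / (n (n + 1))
tail = (p > 0.02).mean()         # to be compared with (1 - 0.02)^(n - 1)
mean_u = u.mean()                # close to 1/2 if u is nearly uniform
```

The values within one sample are dependent, so the pooled averages are only a rough check, but the marginal law of each $F(S(X_j,R_j))$ is the one given by (1.1).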
An example of a measure of deviation of the $W_j$ variables from the uniform
is the statistic

$$S = \sum_{j=1}^{n} \bigl(W_{(j)} - j/n\bigr)^2 ,$$

where $W_{(j)}$, j=1,...,n, are the ordered variables. Notice that

$$S = n \int_0^1 (\hat H(x) - x)^2 \, d\hat H(x)$$

where $\hat H(x)$ is the sample d.f. of the $W_j$.
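
Because the sample d.f. jumps by 1/n at each ordered value, the integral $n\int_0^1 (\hat H(x)-x)^2\,d\hat H(x)$ reduces to a finite sum over the ordered variables. A minimal sketch follows, assuming $\hat H$ is right-continuous so that it equals j/n at the j-th ordered value; the function name `nn_fit_statistic` and the two small inputs are hypothetical.

```python
import numpy as np

def nn_fit_statistic(w):
    """S = n * integral of (H(x) - x)^2 dH(x), with H the sample d.f. of w."""
    w_sorted = np.sort(np.asarray(w, dtype=float))
    n = len(w_sorted)
    h_at_jumps = np.arange(1, n + 1) / n            # H evaluated at the j-th ordered value
    return float(np.sum((h_at_jumps - w_sorted) ** 2))

s_perfect = nn_fit_statistic([0.25, 0.5, 0.75, 1.0])   # ordered values equal j/n exactly
s_shifted = nn_fit_statistic([0.5, 0.25, 1.0, 0.95])   # input order does not matter
```

The statistic is zero when the ordered values sit exactly at j/n and grows as the sample d.f. moves away from the uniform d.f.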


The invariance principle:

This leads us more generally to studying the stochastic process $\hat H(y)$,
$0 \le y \le 1$, and test statistics based on measures of the deviation of
$\hat H$ from the uniform or, more appropriately, on the deviations of
$\hat H$ from its expectation $E\hat H$. We had conjectured, based on some
simulation studies, that statistics such as S were asymptotically
distribution free under the null hypothesis. More generally, we had
conjectured that the limiting distribution of $\sqrt{n}(\hat H(t) - t)$ was a
Gaussian process with zero mean and a covariance not depending on f(x). Our
main result, as given in Section 5, is that this is almost true. What holds
is that for the sequence of processes

$$Z_n(t) = \sqrt{n}\,(\hat H(t) - E\hat H(t)) ,$$

$$Z_n(\cdot) \xrightarrow{\mathcal D} Z(\cdot) ,$$

where Z(t), $0 \le t \le 1$, is a zero mean Gaussian process whose covariance
depends on the hypothesized density g and true density f, and indeed if
g = f, then the covariance does not depend on f. The proof of this theorem
and other results related to the goodness-of-fit test are given in Section 5.
Defining  variables  by 
then $W_{jn}$ has the form

$W_j =$

and, denoting the indicator function by $I(\cdot)$,

$$Z_n(t) = \sqrt{n}\,(\hat H(t) - E\hat H(t)) = \frac{1}{\sqrt n} \sum_{j=1}^{n} \bigl[I(W_j \le t) - E\,I(W_j \le t)\bigr] = \frac{1}{\sqrt n} \sum_{j=1}^{n} \bigl[h(X_j,D_j) - E\,h(X_j,D_j)\bigr]$$

for an appropriate h.

This  identification  suggests  that  the  appropriate  tools  for  the 
invariance  principle  are  a  central  limit  theorem  and  moment  bounds  and 
convergence  theorems  for  sums  of  functions  of  nearest  neighbor  distances. 

A central limit theorem:

The central limit result established in Sections 3 and 4 is that for
a function h(x,d) on $E^{(m)} \times [0,\infty) \to E^{(1)}$ such that h is
uniformly bounded and almost everywhere continuous with respect to Lebesgue
measure,

$$\mathrm{Var}\Bigl(\frac{1}{\sqrt n} \sum_{j=1}^{n} h(X_j,D_j)\Bigr) \to \sigma^2 < \infty$$

and

$$\frac{1}{\sqrt n} \sum_{j=1}^{n} h^*(X_j,D_j) \xrightarrow{\mathcal D} N(0,\sigma^2)$$
where we make the convention here and through the rest of the paper that
for any function $h(X_j,D_j)$

$$h^*(X_j,D_j) = h(X_j,D_j) - E\,h(X_j,D_j) .$$

This is generalized to a multidimensional central limit theorem, and used
to give the result that

$$(Z_n(t_1),\dots,Z_n(t_k)) \xrightarrow{\mathcal D} (Z(t_1),\dots,Z(t_k)) .$$

Our  proof  is  long.  We  believe  that  this  is  due  to  the  complexity 
of  the  problem.  Nearest  neighbor  distances  are  not  independent.  But  for 
large  sample  size  the  nearest  neighbor  distance  to  a  point  in  one  region 
of  space  is  "almost"  independent  of  the  nearest  neighbor  distances  in 
another  region  of  space.  The  main  idea  for  capitalizing  on  this  large 
scale  independence  is  to  cut  the  space  into  a  finite  number  of  cells.  For 
any  point  in  a  given  cell,  let  its  revised  nearest  neighbor  distance  be 
defined  using  only  its  neighbors  in  the  same  cell .  The  first  step,  then, 
is  to  show  that  asymptotically  the  revised  nearest  neighbor  distances  can 
be  substituted  for  the  original  nearest  neighbor  distances.  Now,  given 
the  number  of  points  in  each  cell,  the  set  of  interpoint  distances  within 
the  cell  is  independent  of  those  within  any  other  cell.  Therefore, 
given  the  total  cell  populations,  any  sum  of  functions  of  the  revised 
nearest  neighbor  distances  is  a  sum  of  independent  components,  with  each 
such  component  being  the  sum  of  the  functions  of  the  nearest  neighbor 
distances  within  a  particular  cell. 
However, the multinomial fluctuation of the cell population is not
asymptotically negligible. Thus, the limiting distribution breaks into a
sum  of  two  parts,  one  being  the  nearly  normal  sum  of  the  independent  cell 
components  given  the  expected  value  of  the  cell  populations.  The  other  is 
an  asymptotically  normal  contribution  due  to  the  fluctuations  of  the  cell 
populations  from  their  expected  values.  The  limiting  form  of  the  variance 
reflects  the  nature  of  the  problem.  It  has  one  term  that  would  be  the 
variance  if  all  nearest  neighbor  distances  were  assumed  independent.  Then 
there  are  a  number  of  other,  more  complex,  terms  arising  from  the  local 
dependence. 
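
The cell-cutting idea described above can be illustrated numerically. The following sketch assumes a uniform sample on the unit square, a fixed k x k grid of cells and the Euclidean norm; these choices, the seed and the function names are illustrative assumptions and not the paper's construction in detail. It computes the original nearest neighbor distances and the revised ones (neighbors restricted to the same cell) and records the fraction of points for which the two differ.

```python
import numpy as np

def nn_dist(points):
    """Nearest neighbor distance of every point, using all other points."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return d.min(axis=1)

def revised_nn_dist(points, k):
    """Nearest neighbor distance using only neighbors in the same cell of a k x k grid."""
    cell = np.minimum((points * k).astype(int), k - 1)
    cell_id = cell[:, 0] * k + cell[:, 1]
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d[cell_id[:, None] != cell_id[None, :]] = np.inf   # cut links between different cells
    np.fill_diagonal(d, np.inf)
    return d.min(axis=1)                               # inf if a point is alone in its cell

rng = np.random.default_rng(1)
k = 3
changed = {}
for n in (100, 1500):
    pts = rng.random((n, 2))
    r, r_rev = nn_dist(pts), revised_nn_dist(pts, k)
    changed[n] = float(np.mean(r_rev != r))            # fraction of distances altered by the cut
```

With the number of cells held fixed, only points within roughly one nearest neighbor distance of a cell boundary are affected, so the altered fraction shrinks as n grows, which is the sense in which the revised distances can replace the original ones.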

A  moment  bound: 

Both  the  central  limit  theorem  and  the  tightness  argument 
required  for  the  invariance  proof  rely  on  moment  bounds.  Again, 
there  is  some  difficulty  in  untangling  the  dependence  between  nearest 
neighbor  distances  and  proving  bounds  of  the  type  required. 


For example, we show in Section 2 that for any measurable function
h on $E^{(m)} \times [0,\infty) \to E^{(1)}$ with

$$\|h\| = \sup |h(x,d)| < \infty$$

there  is  a  constant  M  <  «  depending  only,  in  a  specified  and  useful  way, 
on  h  and  the  dimension  m  such  that 


E 
Both the central limit theorem and the moment inequalities (which
improve results in Rogers (1977)) should prove generally useful in methods
employing nearest neighbor distances.

The plan of the presentation is
Section 2: moment bounds
Section 3: second moment convergence
Section 4: central limit theorem
Section 5: invariance and the goodness-of-fit test
Appendix: technical results on nearest neighbor distances

Section  2  on  moment  bounds  is  long  and  somewhat  complex.  But  the 
results  are  needed  in  the  later  proofs.  The  main  results  of  statistical 
interest  are  in  Sections  4  and  5. 

Assumptions  on  the  densities: 

Our  general  assumptions  on  the  density  f(x)  are  that  it  be  uniformly 
bounded  and  continuous  on  its  support.  These  requirements  can  probably 
be  weakened,  but  the  price  may  not  be  worth  the  extra  generality.  The 
following  conditions  are  listed  to  make  the  requirements  formal. 

A:  We  can  choose  a  version  of  f  such  that 

(i)  {f  >0}  is  open 

(ii)  f  is  continuous  on  {f>0} 

(iii)  f  is  uniformly  bounded. 

Corresponding to A we have:

B: The given function g is nonnegative and

(i) $\{g > 0\} \supseteq \{f > 0\}$

(ii) g is continuous on $\{f > 0\}$.

Clearly essentially all situations of interest are covered by A and B.


2.  Moment Bounds

The central result of this section is the fourth-order moment bound
(2.2) which is used to prove tightness via Corollary 2.5. We believe it
will prove generally useful in the study of procedures based on nearest
neighbors. Its formulation and spirit owe much to the excellent thesis
of W. R. Rogers (1977). Our method of proof is, however, different from
his and suited to the rather delicate estimates we must make.

The proof of the central limit theorem requires only the use of the
second-order moment bounds given in Lemma 2.11 and its Corollary 2.15. The
proofs of 2.11 and 2.15 are given early in this section and the reader
interested only in the central limit problem may wish to skip the rest of
the section.


The following notation is used:

P is the probability measure making $X_1,\dots,X_n$ i.i.d. with common
density f.

E without subscript is expectation under P.

$R_i$ is the nearest neighbor distance to $X_i$.

$J_i$ is the index of the nearest neighbor point to $X_i$.

$D_i = n^{1/m} R_i$

I(A) is the indicator of an event.

$F(A) = \int_A f(y)\,dy$

$S(x,r) = \{y;\ \|y-x\| \le r\}$

$S_i = S(X_i,R_i)$

For h a measurable function on $E^{(m)} \times [0,\infty) \to E^{(1)}$, denote

$$h_j = h(X_j,D_j), \qquad h_j^* = h_j - E h_j, \qquad \|h\| = \sup_{x,d} |h(x,d)| .$$

Throughout this section M, with or without a subscript, denotes a
finite generic constant depending only on the dimension m.

Theorem 2.1: If $\|h\| < \infty$ then

$$E\Bigl(\sum_j h_j^*\Bigr)^4 \le M n^2 \|h\|^2 \bigl[E^2|h_1| + n^4 E^2\bigl(|h_1| F^2(S_1)\bigr) + n^{-1}\|h\|^2\bigr] . \tag{2.2}$$

Before  giving  the  proof  of  the  theorem  we  give  two  corollaries. 
Corollary 2.3: Suppose u and w are bounded functions and

$$h(x,d) = u(x)\,w(x,d) .$$

Then there is a constant $C < \infty$ depending on $\|u\|$, $\|w\|$, m such that

$$E\Bigl(\sum_j h_j^*\Bigr)^4 \le C\bigl(n^2 E^2|u(X_1)| + n\bigr) . \tag{2.4}$$

Proof: The corollary follows from

$$E|h_1| \le \|w\|\, E|u(X_1)|$$

$$E|h_1| F^2(S_1) \le \|w\|\, E\{|u(X_1)|\, E(F^2(S_1) \mid X_1)\} = \|w\|\, E|u(X_1)| \, \frac{2}{n(n+1)}$$

where the last equality follows from (1.1).

Corollary 2.5: If

$$h(x,d) = I(a \le g(x)\,d^m \le b)$$

then

$$E\Bigl(\sum_j h_j^*\Bigr)^4 \le M\{n^2 (G_n(b) - G_n(a))^2 + n\} \tag{2.6}$$

where $G_n(y)$, $y \ge 0$, is the distribution function defined by

$$G_n(y) = (1 - \exp(-n))^{-1} \int f(x)\bigl[1 - \exp[-n F(S(x,(y/n g(x))^{1/m}))]\bigr]\,dx$$


Proof:  Let 


0(x)  =  F(S(x,(j^)’/")) 

B(x)  =  F(S(x.(j^)’'"”)) 

Then,  for  j  ^0,  defining  =  F(S(x,a)),  Pg  =  F(S(x,3)), 

E(ih^  |fJ(S^)|x^  =x)  =  E[fJ(S(x.R^)I(p^<  F(S(x,R^))  <P3)|X^  =  x]  . 

.  no 

=  u'^(n-1  )(l-u)^  “^du 

•^P 

a 

£  Mn"'^  j  -^)'^"^dw 

^nPa 

or 

np  np- 

(2.7)  Edhi  |fJ(S^)|x^  =x)  <  Mjn''^(exp(— -exp(-^))  . 

If  we  now  apply  Theorem  2.1  and  use  (2.7)  for  j  =  0,1  the  lemma  follows. 

The  proof  of  Theorem  2.1  proceeds  by  a  construction  similar  to  one 
used  by  Rogers  and  a  series  of  lemmas. 

We assume that we are given a measurable set $S \subset R^m$, F(S) < 1, and
a set of r < n points, $x = (x_1,\dots,x_r)$, where the $x_i$ are fixed points
in $R^m$. Let $Q_r(\cdot \mid S,x)$ be the probability measure on $(R^m)^n$ such
that $X_1,\dots,X_{n-r}$ are independent identically distributed with their
common distribution being the conditional distribution $F(\cdot \mid S^c)$ and
$X_{n-r+i} = x_i$, i = 1,...,r.

We write $F(\cdot \mid S^c)$ as $F_S$. Its density is, of course,

$$f_S(x) = f(x)/F(S^c), \quad x \notin S; \qquad f_S(x) = 0 \quad \text{otherwise.}$$

We typically write $Q_r$ for $Q_r(\cdot \mid S,x)$, and $E_Q$ to denote the
expectation under $Q_r$.

On a common probability space take $X_1,\dots,X_n$ i.i.d. F and
$Y_1,\dots,Y_{n-r}$ i.i.d. $F(\cdot \mid S^c)$ and independent of the $X_i$
and define

$$\tilde X_i = X_i \quad \text{if } i = 1,\dots,n-r \text{ and } X_i \in S^c$$
$$\tilde X_i = Y_i \quad \text{if } i = 1,\dots,n-r \text{ and } X_i \in S$$
$$\tilde X_i = x_{i-n+r} \quad \text{if } i = n-r+1,\dots,n .$$

Clearly $\tilde X_1,\dots,\tilde X_n$ have joint distribution $Q_r$. Let
$\tilde R_i$ be the nearest neighbor distance of $\tilde X_i$ in the set
$\tilde X_1,\dots,\tilde X_n$ and $\tilde D_i$, $\tilde J_i$ be defined
similarly.
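
The coupling just described can be sketched in a few lines. Everything specific below is an assumption made for illustration: the uniform density on the unit square, a disc as the set S, r = 2 fixed points, rejection sampling for the conditional law outside S, and all variable names. The sketch only shows the mechanics: points of the original sample that fall in S are replaced by independent draws from the conditional distribution outside S, and the last r coordinates are overwritten by the fixed points.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 200, 2
center, radius = np.array([0.5, 0.5]), 0.2        # S: an open disc (illustrative choice)

def in_S(pts):
    return np.linalg.norm(pts - center, axis=-1) < radius

def sample_outside_S(size):
    """Draw from F(. | S^c) for the uniform density on [0,1]^2 by rejection."""
    out = np.empty((0, 2))
    while len(out) < size:
        cand = rng.random((2 * size + 10, 2))
        out = np.vstack([out, cand[~in_S(cand)]])
    return out[:size]

X = rng.random((n, 2))                            # X_1, ..., X_n i.i.d. F
Y = sample_outside_S(n - r)                       # Y_i i.i.d. F(. | S^c), independent of X
x_fixed = np.array([[0.5, 0.5], [0.7, 0.5]])      # the r fixed points

X_tilde = X.copy()
moved = in_S(X[: n - r])                          # which of the first n - r points lie in S
X_tilde[: n - r][moved] = Y[moved]                # replace them by the corresponding Y_i
X_tilde[n - r:] = x_fixed                         # last r coordinates are the fixed points
N = int(moved.sum())                              # number of changed points among the first n - r
```

Here the count of changed points has mean $(n-r)F(S)$, since a point is changed exactly when it falls in S.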


Lemma 2.8: For n > r, there is a constant $M_0$ such that

$$|E_Q h(\tilde X_1,\tilde D_1) - E\,h(X_1,D_1)| \le \|h\|\, M_0 \Bigl(\frac{r}{n} + F(S)\Bigr) .$$

Proof: For $r \ge n/2$, the bound holds trivially. For $n/2 > r$,


(2.9) 


£q  h(XpO^)  -  £  h(XpD^)i  = 


(n-r)''i  X  [£  h(X  0  )  -  £  h('X. ,[).)]' 
i=l  ’  ’  ^  ^ 

1  (n-r)’"'  £  I  _  !h(Xi,D.j)  -  h(X.,d^.)l 


1  (n-r)*^  \\h\\El  {I(X,A)  -  I(X,=i  .  R.  =  R.)} 

i=l 
Let

$$N = \sum_{i=1}^{n-r} I(\tilde X_i \ne X_i) ,$$

the number of "changed" points among the first n-r. Note that $EN = (n-r)F(S)$.
Now 


X.=X.)  <  I_  !(J^=j,  J,;  =  k,X,rX..  or 
j  >  k 


and  hence 

(2.10)  I_  KR^-fR^.  X.=x.)  <  iKXjfx.)  I  -  I(j,=j) 

liKr\)I  i(Jrk) 

k  ^  k  ' 

<  2a(m)(N+r) 


by  corollary  SI  of  the  appendix. 


From  (2.9)  -  (2.10)  and  the  boundedness  of  h, 


i 


!  !hi 


!  I 


'(H2a(m))F(s) 


< 


:h;!2(U23(m))(F(s 


and  the  lemma  is  proved. 
Lemma 2.11: For $\|g\|, \|h\| < \infty$ denote $h_1 = h(X_1,D_1)$,
$g_2 = g(X_2,D_2)$. Then for $n \ge 4$,

$$|\mathrm{cov}(h_1,g_2)| \le M_1 \|g\| \bigl(n^{-1} E|h_1| + E|h_1 F(S_1)|\bigr) .$$


Proof:  Write 


|cov(h^,g2)l  <  ^ 


(2.12) 


Moreover, 


[Jt=2] 


(2.13) 


CJi/2] 


h^g2dP  = 


h^{E(g2|X^.Xj  ,J^)  -Eg2}dP 


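On the set $J_1 \ne 2$, given $X_1 = x_1$, $X_{J_1} = x_2$, the
$\{X_j,\ 2 \le j \le n,\ j \ne J_1\}$ are distributed according to
$Q_2(\cdot \mid S(x_1, \|x_2 - x_1\|), (x_1,x_2))$. By Lemma 2.8

$$\Bigl|\int_{[J_1 \ne 2]} h_1 g_2^* \, dP\Bigr| \le \int |h_1| \, M_0 \|g\| \bigl(2n^{-1} + F(S_1)\bigr)\, dP \le 4 M_0 \|g\| \bigl[n^{-1} E|h_1| + E|h_1 F(S_1)|\bigr] \tag{2.14}$$

and the lemma follows from (2.12)-(2.14).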


Corollary 2.15: For $\|h\|, \|g\| < \infty$, and for $n \ge 4$,

$$|\mathrm{cov}(h_1,g_2)| \le M_2 \|g\| (E h_1^2)^{1/2} / n .$$

Proof: From (1.1) it follows that $E F^2(S_1) = 2/n(n+1)$. Now apply the
Schwarz inequality.
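
The two steps of this proof can be written out as follows. This is a sketch of the computation the proof appears to rely on, using only (1.1) and the bound of Lemma 2.11; the intermediate constants are not taken from the paper.

```latex
\begin{align*}
P\bigl(F(S_1) > z\bigr) &= (1-z)^{n-1}, \qquad 0 \le z \le 1, \\
E F^2(S_1) &= \int_0^1 2z\,(1-z)^{n-1}\,dz
            = 2\,\frac{\Gamma(2)\,\Gamma(n)}{\Gamma(n+2)}
            = \frac{2}{n(n+1)}, \\
E|h_1 F(S_1)| &\le (E h_1^2)^{1/2}\,\bigl(E F^2(S_1)\bigr)^{1/2}
            \le \frac{\sqrt{2}}{n}\,(E h_1^2)^{1/2}, \\
n^{-1} E|h_1| &\le n^{-1}\,(E h_1^2)^{1/2}.
\end{align*}
```

Inserting the last two lines into the bound of Lemma 2.11 gives a bound of the form $M_2 \|g\| (E h_1^2)^{1/2}/n$, which is the statement of the corollary.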
The bounds in Lemma 2.11 and Corollary 2.15 can clearly be made symmetric
in $h_1$ and $g_2$. We use them primarily for

2 

Lemma  2.16: 

Proof:  Let  (X^X^j) . joint  distribution  as  the 

vector  {(X.|  ,X.| ),...  ,(X^,X^)}  and  be  independent  of  that  vector.  Let 
primes  on  0^.  etc.  as  usual  denote  calculations  based  on  the 

appropriate  sample.  Then 

(2.17)  cov  (hph2)  -  coVq  (hph2)  =  j  £  A 

A  “  (hi "hi ) (h2“h2)  "  (hi "hi ) (h2"h2) 

where 

I  II  '\i'  'V'V/' 

hi  =  h(Xi,Di),  h.  »  h(Xi,di),  h^  =  h(Xi,0i) 

The  proof  proceeds  by  a  series  of  steps. 

Let 


I  I  'V.l 

E.  »  {hith.} 


I(£l)  <  KX^fX,)  +  I(X.=Xi, 


Since 
(2.18)  max{P(E. n £.) , P(E^. A E,^) ; all i + f2(S)


Lemma A.1 and elementary arguments yield that


2  *  c 

Since  *  0  on  CU^._i{E^-UE.}]  ,  (2.18)  and  symmetry  arguments  imply 


( 2. 1 9)  1  EA|  <  4  I  E(h^  -h"^}(h2-h2)  I  ( E^  E^CeJ  I^CE^]^ ) ' 


N  ^  -  F^(S) 

\n  j 


Using  lemma  A.l  again  we  bound  the  first  term  on  the  right  hand  side 
of  (2.19)  by, 

(2.20)  4|E{(h^.ft^)(h2-h2)(I(J-,?2,"J^;«2.X2=3(2)Cl(X^;^3(^)+I(;(^*)(^,  R^ff^^)]))! 


+  M||h||  2(1:^  +  f2(S) 

\n  ) 


-  '’•u 


Let  5  =  {i  :  X.j^X.}.  Given  =,  X.  .ieH.X.,  ,Xj  ,X^  ,rj  ,X^  and  'k^'k^  the 
variables  X.j,...,X^  can  be  permuted  to  have  a 

Q^(-|s(x^,Ri)us(x^,r^).{x.,  ieE,  x^,Xj^,Xj^}) 
distribution  with  X2  in  the  lead  and  r  =  N+I(X^»X^)  +  "*■

Conditioning  on  this  information  within  the  expectation  in  (2.20)  and  using 

I 

the  independence  of  h2  we  can  apply  lemma  2.8  to  the  difference  between 

I 

the  conditional  expectation  of  h2  and  Eh2  and  bound  the  first  term  in 
(2.20)  by 


(2.21) 


4Mhll^  MQ(m)  e|(I(X^J?X^)  +  I(X^=X^  )  ^^ 


+  F(S^)  +  F(S^)| 

Estimates  of  the  order  -^  +  F  (S)  for  all  the  terms  in  (2.21)  are  given 
in  lemma  A, 2.  Combining  (2.19)  -  (2.21)  the  lemma  follows. 


Lemma  2.22: 


(2.23) 


KWaI  fl^ll  + 

\  n 


★  ★  ★  ★ 


Proof:  Let  £^3  ’  CJ-] 


2n-3 


Then, 


(2.24) 


«  /  h*h2  {covp  (h.,,h2) 


+  (Ep  h^-Eh^)ndP 


Qy.  ’  Q^(-|S(XTRT)US(X2,R2)«f''<V''^2’'^J.,  ■^J2'‘'^"‘^  ^ 


where 
Apply lemmas 2.8, 2.11 and 2.16 to get


(2.25)  [j  tt  dPl  <  (m^  llhll  (n“’E|h^i  +  Elh^FCS^)!))  X  h*h*dP! 

Eio  \  /  E,o 


•'^2  IIM!^  I  +  r2(S2))dP 


(2.26)  = 

f  ■k  it 

+  2]  h,h. 

CJ2=3,J^^{3,4}]  ^ 

Condition  in  the  first  integral  on  the  right  in  (2.26)  by  ,Xj^ 
and  apply  lemma  2.8  to  get  the  bound 


(2.27) 


2MJ|hM  /  ih*i  (n-''  +  F(S,))dP 


14Mo  (n'''E|h^I  +  E!h^F(ST)|) 

by  the  usual  symmetry  argument.  Condition  in  the  second  integral  by 
X2.Xj  ,^2  and  obtain  a  bound  as  in  (2.27).  Conclude  that 

I /.  h*h*|  <  !cov(h^,h2)|  +  M  (Elh^l  n'^  -  E!h,F(S^) 

^12 




and  hence  that  the  first  term  in  (2.25)  is  bounded  by 


(2.28) 


On  the  other  hand,  applying  lemma  2.8  again 


(2.29)  ||h*h2|(n'2  +  f2(S^))  <  |!h(!  f 


[Ji=23' 


'h*i(n’2  +  c2(s  )) 


+  /  |h!|(n-2  ^  F2(S,)){Elh2l  +  M  llh||  (n'^  +  -(S^))} 

U0Tt2]  '  1^0 

^  2 

The  first  term  in  (2.29)  is  iM|lh||  "n"  by  the  usual  symmetry  argument. 
The  second  is 

iM(E4h^!n“2+  E|h^l  £lh^|F2(S^))  +  Hhil^n-^ 

(2.30)  <  M(2(E4h^|n“2  +  n^  E^jh^[F2(:.|))  +  ![h||^n“^) 

and  hence  combining  (2.28)  and  (2.30)  we  get 


(2.31) 


ij  dP|  <  M|Ih||2  —  ^  ^  E4hT|F(ST) 

^^12  \  n^  '  ' 

+  n^  E4hT|F2(S^)  +  Ilhjl  ^  n"^)  . 


Now consider


(2.32) 


/, 


TT  dP 


[^7=3] 


=3,8321(2,4}] 


Ttdp  + 


2  f  iT  dP 

[J, =3, 83  =  2] 
Next write,

(2.35)  /  rrdP  a  /  irdP  +  /  tt  dP 

Lo-j  =3  ,J2’2]  [31-3  ,J2t4]  CJi*3  ,J2®2,J2®4] 


Now 

(2.36)  P[J.|a3,J2*2,J2*4]  =  ^  ^PCJl“3,J2=2,J2*i] 

<  (n-3)"^  PCJt=3,J3=2]  <  (n-3)"^n-2)'^PCJ^*33 

<  M  n"^ 


Hence, 

(2.37)  l/  -irdPl  <  Mj  n'^ 

[J^a3,J2a2,J2=*4] 

Next  condition  on  X.|  ,X2,X2,Ji  ,32.33,^1  .R2’*^3  first  term  of  (2.35) 

and  apply  lemma  2.8  to  get 
$$P[J_1 = 3,\ J_3 = 2] \le M n^{-2}$$


as  in  (2.36)  and  similarly, 


(2,39) 


f  F(ST)dP  <  (n-2)"''  /  F(S,)dP 

^[Jr3.J3=2]  ^  ^[3^-3] 


=  [(n-2)(n-lj]"''  EF(S^)  <  Mn“^ 


(2,40) 


f  F(S.)dP  =  (n-2)’''  f  FiS.)!  I(J,-=3)dP 

-'CJr3.J3=2]  2  ^  if2.3  ' 


by  corollary  Si , 


<  (n-2)’^a(m)  f  F(S7)dP 


1  C(n-2)(n-l)]‘''a^(m) /F(S2)dP  < 


(2.41) 


f  F(SJdP  <  C(n-2)(n-l)]*^a(m)EF(SJ  <  Mn"^ 

^CJt*3,J3=2]  ^  “ 


Combining  these  estimates  with  (2.38),  (2,37)  and  (2.35)  we  get. 


(2.42) 


if  TT  dPl  <  Ml  lh|  1*^0"^ 

'''Cj^»3,J3»23  ‘  - 
and hence from (2.32), (2.34) and (2.42),


-2;,.  . 


(2.43) 


I  r  TTdPi  <  Ml  lh|  I 

^[^1=3] 


■.m2 


u  i 


-  I  h,  I  r  (Si  ) 


+  I  Ihi n'^ 


Next  consider, 


(2.44) 


f  TTdp  =  f  rrdP  -  f  rrd? 

^[J2=3,J^J;{3,4}]  ^[32=3] 


/, 


T7d? 


[J2~3  ,  J-j  *4] 


Of  these  terms  the  first  is  bounded  in  (2.43).  The  next  is  written. 


(2.45) 


f  TTdP  r  [ 

^[J^=J2=3,  J3V4J  ^[Jt=32="’ 


TTdP 


The  second  term  in  (2.45)  is  bounded  by  n'"^  as  in  (2.40).  The 

first  (conditioning  on  X,,Xo,X,,  etc.)  is  bounded  by 

M!|h!r  r  (n’'  +  I  ^  '"(S,-))dP 

^[J^=J2=3]  i=l 

and again by $M\|h\|^4 n^{-3}$ by arguing as in (2.39) - (2.41). For example,


/F(s,)dP<^  /r:3,)dM.( 


■'^1 


(m) rn(n-l ) (n-2) ' 


Finally,


(2.46) 


by  lemma  2. 
(2.47) 

Mow  by  the 


=3,J^=4] 


2=3,J^=4] 


< 


1  [(n-3)(n-2)]'''  lih!  E!h*h*! 

<  Mn"^|(h||^  (E^lh^l  +  cov(  ( h^J^j ,  1  h*| )) 


<  Mn 


-2 


f2 


11.  By  our  discussion  and  (2.43)  -  (2.46), 


Schwartz  inequality, 

E^ih^lF(ST)  <  E;h^|  E;h.,iF2(S.) 

'n,  I 

1  “1  I  0  0  7 

<  - n-£2lh,iF-(3,) 


The  lemma,  therefore,  follows  from  (2.31)  and  (2.47). 


Lemma 

(2.49) 

Proof: 

denote 


while 


^.48:  For  Mg  <  “ 


. . 


The argument goes much as for lemma 2.22 and is sketched. If we


the  integrand  by  ft 


I/:  ,  irVl  <N||t.||  (n-’£|h,]  t  E|h,[F(S,)) 


|2  ('..'"Ipiu  I  j.  -c2|t  itrFc  \  j.  llul|2  «“2^ 


<  M||h|r  Cn‘'E|h^  1  +  nE^jh^  |F(S^)  + 


n  >, 


\f  [hr]2h*h*dPl  <  llhll^  f  lh*h3|dP  <  Mn"''|lh|l2  /[hXldP 

-'[0t=2]  “  -'rj,»2l  '  ^  \  C 


<  M|  [hi  n’^  (£^|h^  I  +  n“‘  |  |h!  1“^)  arguing  as  in  (2.46) 


“1  M  u I  I 2\ 


The  lemma  follows. 


Proof of Theorem: Write

$$E\Bigl(\sum_j h_j^*\Bigr)^4 \le n E[h_1^*]^4 + 6n(n-1)\, E[h_1^*]^2[h_2^*]^2 + 6n(n-1)(n-2)\, |E[h_1^*]^2 h_2^* h_3^*| + n(n-1)(n-2)(n-3)\, |E h_1^* h_2^* h_3^* h_4^*| . \tag{2.50}$$

We apply lemmas 2.22 and 2.48 to the last two terms of (2.50); note that the
second term is

$$\le 6n^2 \|h\|^2 \bigl(E^2|h_1^*| + |\mathrm{cov}(|h_1^*|,|h_2^*|)|\bigr)$$

and apply lemma 2.11, and bound $E[h_1^*]^4$ by $16\|h\|^4$.

The theorem follows.


3.  Second Moment Convergence

The central result of this section is the evaluation of the limit
of $\mathrm{Var}\bigl(\frac{1}{\sqrt n} \sum_1^n h(X_j,D_j)\bigr)$ for a certain
class of functions h. Starting with the density f(x), define

$$\gamma(x) = f(x)^{-1/m} ,$$

and for any measurable function h on $E^{(m)} \times [0,\infty) \to E^{(1)}$, let

$$\tilde h(x,r) = h(x,\gamma(x) r) .$$

Define $L_0$, $L_1$, $L_2$ by

$$L_0(r) = e^{-V(r)} \tag{3.1}$$

$$L_1(r_1,r_2) = e^{-V(r_1)-V(r_2)} \bigl[V(r_1) + V(r_2) - V(r_1)V(r_2)\bigr] \tag{3.2}$$

$$L_2(r_1,r_2) = e^{-V(r_1)-V(r_2)} \Bigl[\int_{B(r_1,r_2)} \bigl(e^{V(r_1,r_2,z)} - 1\bigr)\,dz - V(\max(r_1,r_2))\Bigr] \tag{3.3}$$

where

$$B(r_1,r_2) = \{z;\ \max(r_1,r_2) \le \|z\| \le r_1 + r_2\}$$

$$V(r_1,r_2,z) = \int_{S(0,r_1) \cap S(z,r_2)} dy .$$

For any two functions h, h' define the functional L(h,h') by

$$L(h,h') = \int\!\!\int \tilde h(x_1,r_1)\,\tilde h'(x_2,r_2)\, f(x_1) f(x_2)\, L_1(dr_1,dr_2)\, dx_1 dx_2 + \int \tilde h(x,r_1)\,\tilde h'(x,r_2)\, f(x)\, L_2(dr_1,dr_2)\, dx . \tag{3.4}$$

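The moment convergence result is

Theorem 3.5: If h is measurable on $E^{(m)} \times [0,\infty) \to E^{(1)}$
and satisfies

(i) $\|h\| < \infty$

(ii) the set of discontinuities of h has Lebesgue measure 0,

then

$$\mathrm{Var}\Bigl(\frac{1}{\sqrt n} \sum_{j=1}^{n} h(X_j,D_j)\Bigr) \to \sigma^2(h)$$

where

$$\sigma^2(h) = \int \tilde h^2(x,r) f(x) L_0(dr)\,dx - \Bigl[\int \tilde h(x,r) f(x) L_0(dr)\,dx\Bigr]^2 + L(h,h) . \tag{3.6}$$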

As the proof will reveal, the first two terms of (3.6) would be the
limit if the $R_j$ were independent. The L(h,h) term is contributed by the
local dependence of the nearest neighbor distances.

The proof of the theorem is split into two pieces. Proposition 3.7
below shows that the diagonal terms in

$$E\Bigl(\frac{1}{\sqrt n} \sum_{j=1}^{n} h^*(X_j,D_j)\Bigr)^2$$

converge to the first two terms of (3.6). Then proposition 3.10 gives
convergence of the off-diagonal terms to L(h,h). We assume throughout that
the conditions of the theorem hold.

Let X, D be a random m-vector and nonnegative random variable
respectively such that X has density f and

$$P[D > r \mid X] = \exp\{-f(X)V(r)\} .$$

Equivalently, $D/\gamma(X)$ is independent of X and

$$P[D/\gamma(X) > r] = L_0(r) .$$
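
The limiting law of the scaled nearest neighbor distance can be checked in the simplest case. The sketch below assumes m = 1, the uniform density on [0, 1] (so f = 1 on the support) and that V(r) denotes the volume of the ball of radius r, which is 2r in one dimension; these are assumptions made for the illustration, as are the seed, the sample sizes and the function name. It compares the empirical frequency of $\{D > r\}$ with $\exp\{-f\,V(r)\}$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps, r = 500, 200, 0.5

def scaled_nn_distances_1d(x):
    """D_j = n^(1/m) R_j for a one-dimensional sample (m = 1)."""
    xs = np.sort(x)
    gaps = np.diff(xs)
    R = np.minimum(np.concatenate(([np.inf], gaps)), np.concatenate((gaps, [np.inf])))
    return len(xs) * R

D = np.concatenate([scaled_nn_distances_1d(rng.random(n)) for _ in range(reps)])
empirical = float((D > r).mean())        # pooled estimate of P(D_1n > r)
limit = float(np.exp(-1.0 * 2.0 * r))    # exp(-f V(r)) with f = 1 and V(r) = 2r
```

The two end points of each sample have a neighbor on one side only, which slightly inflates the empirical frequency, but with n = 500 this boundary effect is small compared with the sampling error.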
Proposition 3.7: Let f satisfy A(i)-(iii). Then, as $n \to \infty$,

$$(X_1, D_{1n}) \xrightarrow{\mathcal D} (X,D)$$

where $(X_1,D_{1n})$ is used to stand generically for the common law of any of
the pairs $(X_i,D_i)$ and $\xrightarrow{\mathcal D}$ denotes convergence in
distribution. Therefore

$$E\,h(X_1,D_{1n}) \to \int \tilde h(x,r) f(x) L_0(dr)\,dx \tag{3.8}$$

$$\mathrm{Var}\, h(X_1,D_{1n}) \to \int \tilde h^2(x,r) f(x) L_0(dr)\,dx - \Bigl(\int \tilde h(x,r) f(x) L_0(dr)\,dx\Bigr)^2 . \tag{3.9}$$

Proof: Almost immediate, since

$$P(D_{1n} > r \mid X_1 = x) \to e^{-f(x)V(r)} = P(D > r \mid X = x)$$

and the set of discontinuities of h has probability zero with respect to
the (X,D) distribution.

Proposition 3.10: For h(x,r) any function satisfying the hypothesis of
theorem 3.5

$$n\,\mathrm{Cov}(h(X_1,D_1),h(X_2,D_2)) \to L(h,h) .$$

Proof: It is, we assert, sufficient to show for any two functions
$\phi_1$, $\phi_2$ of the form

$$\phi_i(x,r) = g_i(x)\, I(r > r_i), \qquad i = 1, 2 \tag{3.11}$$

with $g_i(x)$ uniformly continuous and bounded, that

$$n\,\mathrm{Cov}(\phi_1(X_1,D_1),\phi_2(X_2,D_2)) \to L(\phi_1,\phi_2) . \tag{3.12}$$

To see this note that if $\mathcal T$ is the set of all finite linear
combinations of functions of the form (3.11) then we can get a sequence
$h_k \in \mathcal T$ such that

$$\|h_k\| \le 2\|h\|$$

and with respect to L-measure on $E^{(m)} \times [0,\infty)$, $h_k \to h$ a.e.
(since h is a.e. continuous). Now

$$\mathrm{Cov}(h(X_1,D_1),h(X_2,D_2)) - \mathrm{Cov}(h_k(X_1,D_1),h_k(X_2,D_2)) = \mathrm{Cov}(h(X_1,D_1) - h_k(X_1,D_1),\ h(X_2,D_2) + h_k(X_2,D_2)) . \tag{3.13}$$

Using corollary 2.15 on (3.13) gives the bound

$$\limsup_n\ n\,|\mathrm{Cov}(h(X_1,D_1),h(X_2,D_2)) - \mathrm{Cov}(h_k(X_1,D_1),h_k(X_2,D_2))| \le c\,\|h\|\, (E|h-h_k|^2)^{1/2} .$$

Now the bounded convergence theorem gives $E(h-h_k)^2 \to 0$, and (3.12)
implies that

$$n\,\mathrm{Cov}(h_k(X_1,D_1),h_k(X_2,D_2)) \to L(h_k,h_k) .$$

Since $L(h_k,h_k) \to L(h,h)$, the assertion follows.

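Proof of (3.12): For i = 1,2, let

$$S_i = S(x_i, n^{-1/m} r_i), \qquad F_i = F(S_i), \qquad F_{12} = F(S_1 \cup S_2)$$

and let

$$A = \{(x_1,x_2);\ \|x_1-x_2\| > n^{-1/m}(r_1+r_2)\}$$

$$B = \{(x_1,x_2);\ n^{-1/m}\max(r_1,r_2) \le \|x_1-x_2\| \le n^{-1/m}(r_1+r_2)\}$$

$$C = \{(x_1,x_2);\ \|x_1-x_2\| < n^{-1/m}\max(r_1,r_2)\} .$$

Then

$$P(R_1 > n^{-1/m} r_1,\ R_2 > n^{-1/m} r_2 \mid X_1 = x_1, X_2 = x_2) =
\begin{cases}
(1-F_1-F_2)^{n-2}, & (x_1,x_2) \in A \\
(1-F_{12})^{n-2}, & (x_1,x_2) \in B \\
0, & (x_1,x_2) \in C
\end{cases}$$

$$P(R_i > n^{-1/m} r_i \mid X_i = x_i) = (1-F_i)^{n-1} .$$

Then, denoting

$$L(x_1,x_2,r_1,r_2) = P(R_1 > n^{-1/m} r_1,\ R_2 > n^{-1/m} r_2 \mid X_1 = x_1, X_2 = x_2) - [(1-F_1)(1-F_2)]^{n-1}$$

and $g_i(x_i)$ by $g_i$, $f(x_i)$ by $f_i$,

$$\mathrm{Cov}(\phi_1,\phi_2) = \int\!\!\int g_1(x_1) g_2(x_2)\, L(x_1,x_2,r_1,r_2)\, f(x_1) f(x_2)\, dx_1 dx_2$$

$$= \int\!\!\int g_1 g_2 \bigl[(1-F_1-F_2)^{n-2} - (1-F_1)^{n-1}(1-F_2)^{n-1}\bigr] f_1 f_2
 + \int\!\!\int_B g_1 g_2 \bigl[(1-F_{12})^{n-2} - (1-F_1-F_2)^{n-2}\bigr] f_1 f_2
 - \int\!\!\int_C g_1 g_2\, (1-F_1-F_2)^{n-2} f_1 f_2$$

$$= I_1 + I_2 - I_3 .$$
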

Because $nF_i \le \bar f\, V(r_i)$, where $\bar f$ is the supremum of f, and
$nF_i \to f(x_i)V(r_i)$, for fixed $x_1$, $x_2$

$$n\bigl[(1-F_1-F_2)^{n-2} - (1-F_1)^{n-1}(1-F_2)^{n-1}\bigr]
= n(1-F_1)^{n-2}(1-F_2)^{n-2} \Bigl[\Bigl(1 - \frac{F_1 F_2}{(1-F_1)(1-F_2)}\Bigr)^{n-2} - (1-F_1)(1-F_2)\Bigr]$$

$$\to e^{-f(x_1)V(r_1) - f(x_2)V(r_2)} \bigl[f(x_1)V(r_1) + f(x_2)V(r_2) - f(x_1)f(x_2)V(r_1)V(r_2)\bigr] .$$

Furthermore, the convergence is bounded. Therefore

$$n I_1 \to \int\!\!\int \tilde\phi_1(x_1,r_1)\,\tilde\phi_2(x_2,r_2)\, L_1(dr_1,dr_2)\, f(x_1) f(x_2)\, dx_1 dx_2$$

as can be seen by making the transformations $V(\tilde r_i) = f(x_i)V(r_i)$.
In $I_2$, $I_3$ make the transformation

$$x_2 = x_1 + n^{-1/m} z ,$$

leading to

$$B = \{(x_1,z);\ \max(r_1,r_2) \le \|z\| \le r_1 + r_2\}$$

$$C = \{(x_1,z);\ \|z\| < \max(r_1,r_2)\} .$$

On $B \cup C$, for $x_1$ fixed

$$f(x_2) g_2(x_2) \to f(x_1) g_2(x_1)$$

uniformly, and

$$nF_i \to f(x_1)V(r_i)$$
$$nF_{12} \to f(x_1)\bigl[V(r_1) + V(r_2) - V(r_1,r_2,z)\bigr]$$

where

$$V(r_1,r_2,z) = \int_{S(0,r_1) \cap S(z,r_2)} dy .$$

Therefore

$$n I_2 \to \int \Bigl[\int_B \bigl(e^{f(x)V(r_1,r_2,z)} - 1\bigr)\,dz\Bigr] e^{-f(x)[V(r_1)+V(r_2)]}\, g_1(x) g_2(x) f^2(x)\,dx .$$

A simpler argument gives

$$n I_3 \to \int V(\max(r_1,r_2))\, e^{-f(x)[V(r_1)+V(r_2)]}\, g_1(x) g_2(x) f^2(x)\,dx .$$

In both integrals, make the substitution $V(\tilde r_i) = f(x)V(r_i)$ and add
the limits together to get the proposition.
4.  A Central Limit Theorem

The main result of this section is

Theorem 4.1: Suppose the set of discontinuities of h has Lebesgue measure
0 in $E^{(m)} \times [0,\infty)$ and

$$\sup_{x,d} |h| = \|h\| < \infty .$$

Then if the density of the distribution satisfies A(i)-(iii),

$$\frac{1}{\sqrt n} \sum_{j=1}^{n} h^*(X_j,D_j) \xrightarrow{\mathcal D} N(0,\sigma^2(h)) \tag{4.2}$$

where $\sigma^2(h)$ is given in Theorem 3.5.

The  proof  proceeds  in  a  series  of  propositions. 


Notational  convention: 

Lower case c denotes a constant depending only on m and $\|h\|$. The
dependence of other constants on various auxiliary parameters introduced
below will be noted as needed.


Proposition 4.3: There exists a sequence of bounded sets $C_N$ with
$C_N \subset C_{N+1}$ and

1) $\mathrm{diameter}(C_N) \le N$

2) $\inf_{x \in C_N} f(x) = \delta_N > 0$

3) $P(X \in C_N^c) \to 0$


Proof:  There  exist  compact  sets  !  f  dx  — 1 .  Choose 


5|m  >  0  such  that  5|y[  dx  — 0.  Let 

IN  INJ, 


F|y,  =  (x;  f(x)  >6^} 
and  take  Then


In preparation for the next step, let $D_N$ be a cube of side N such
that $C_N \subset D_N$. Divide $D_N$ into $L = k^m$ congruent subcubes
$D_{N,\ell}$, $\ell = 1,\dots,L$, and let

$$B_\ell = D_{N,\ell} \cap C_N, \qquad \ell = 1,\dots,L$$

$$B = \bigcup_\ell \partial(B_\ell)$$

where $\partial$ denotes boundary. The $B_\ell$, $\ell = 1,\dots,L$ provide the
basic cells such that nearest neighbor links between different cells will be
cut. From now on until the end of the string of propositions N and the
$B_\ell$, $\ell = 1,\dots,L$ will be fixed.

Select $d_N > 0$ and let

$$E_N = \{x;\ x \in C_N,\ d(x,B) \ge d_N\}$$

where d(x,B) is the distance from x to the set B. Write (X,D) for $(X_1,D_{1n})$.
Note  that  by  using  f(x)  £  sup  f(x)  =  f,  we  get 

X 

P(XeC^,  d(X,B)<d^)  <  2md^J/V"''f  . 

Now  let  • 

h(x,d)  =  I(xeE,^)h(x,d)  . 

We  suppress  dependence  on  N,  L  here  and  in  the  sequel  except  where 

★  * 

emphasis  is  needed.  Denote  (recalling  that  h  =  h-Eh,  h  =  h-Eh), 
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Z„(N.L)  . 

Proposition  4.4:  E(Z^  -  Z^(N,L))^  i  c(P(X 
Proof:  This  follows  directly  from  corollary  2.15. 

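For the next step, define

$R_j' = 0$ if $X_j \in B_\ell$ and no other $X_i \in B_\ell$,

$R_j' = \inf \{ \|X_i - X_j\| : i \ne j,\ X_i \in B_\ell \}$ if $X_j \in B_\ell$ otherwise,

and redefine h(x,0) = 0. Let $D_j' = n^{1/m} R_j'$ and

$Z_n'(N,L) = n^{-1/2} \sum_{j=1}^{n} \tilde h^*(X_j, D_j')$.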

Proposition 4.5: $E(Z_n(N,L) - Z_n'(N,L))^2 \le c\,n\,e^{-(n-1)\varepsilon_N V(d_N)}$, where $\varepsilon_N > 0$ depends only on N.

Proof:  $E(Z_n(N,L) - Z_n'(N,L))^2 \le \frac{1}{n} E(\sum_j A_j)^2$

where

$A_j = \tilde h(X_j, D_j) - \tilde h(X_j, D_j') - E(\tilde h(X_j, D_j) - \tilde h(X_j, D_j'))$,

so

$E(Z_n(N,L) - Z_n'(N,L))^2 \le \sum_j E(\tilde h(X_j, D_j) - \tilde h(X_j, D_j'))^2$.

Now $X_j \in E_N$ and $d(X_j, B) > R_j$ implies $R_j' = R_j$. So

$E(Z_n(N,L) - Z_n'(N,L))^2 \le 2\|h\|^2 \sum_j P(R_j' \ne R_j,\ X_j \in E_N)$

$\le 2\|h\|^2 n\, P(d(X,B) < R,\ X \in E_N)$

where (X,R) stands for $(X_1, R_1)$ by our usual convention. Now


$P(R > r \mid X = x) = [1 - F(S(x,r))]^{n-1}$.

Note that $d(X,B) \le \sqrt{m}\,N$ for $X \in E_N$. Now

$\inf_{x \in \bar C_N} \inf_{0 \le r \le \sqrt{m} N} [F(S(x,r))/V(r)] = \varepsilon_N > 0$

since $M(r,x) = F(S(x,r))/V(r)$ is jointly continuous on $\bar C_N \times [0, \sqrt{m}\,N]$, where $\bar C_N$ is the closure of $C_N$, and since $M(r,x) > 0$ everywhere in $\bar C_N \times [0, \sqrt{m}\,N]$.

Therefore

$P(R > d(X,B),\ X \in E_N) \le \int_{x \in E_N} e^{-(n-1)\varepsilon_N V(d(x,B))} f(x)\,dx$.

For $x \in E_N$, $d(x,B) \ge d_N$, so

$P(R > d(X,B),\ X \in E_N) \le e^{-(n-1)\varepsilon_N V(d_N)}$

and the proposition follows.
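The key step in the proof, that a point lying farther from the cell boundaries than its nearest neighbor distance keeps the same nearest neighbor after links between cells are cut, can be checked numerically. The following is only a sketch under simplifying assumptions that are not in the paper: the Euclidean norm, uniform points on the unit cube, and the cube itself playing the role of $C_N$ (so the cells are the subcubes). All function names are hypothetical.

```python
import math
import random

def nn_distances(points):
    """Brute-force Euclidean nearest neighbor distance R_j for every point."""
    return [
        min(math.dist(p, q) for i, q in enumerate(points) if i != j)
        for j, p in enumerate(points)
    ]

def cell_index(p, k):
    """Index of the subcube (side 1/k) of the unit cube that contains p."""
    return tuple(min(int(c * k), k - 1) for c in p)

def restricted_nn_distances(points, k):
    """R'_j: nearest neighbor distance within the point's own cell (0 if alone)."""
    cells = {}
    for j, p in enumerate(points):
        cells.setdefault(cell_index(p, k), []).append(j)
    out = [0.0] * len(points)
    for members in cells.values():
        for j in members:
            others = [math.dist(points[j], points[i]) for i in members if i != j]
            out[j] = min(others) if others else 0.0
    return out

def dist_to_cell_boundary(p, k):
    """Distance from p to the boundary of the subcube that contains it."""
    idx = cell_index(p, k)
    return min(min(c - i / k, (i + 1) / k - c) for c, i in zip(p, idx))

def demo(n=500, m=2, k=3, seed=0):
    """Return points, R_j, R'_j and the number of points whose link was cut."""
    rng = random.Random(seed)
    pts = [tuple(rng.random() for _ in range(m)) for _ in range(n)]
    R = nn_distances(pts)
    Rp = restricted_nn_distances(pts, k)
    cut = sum(1 for a, b in zip(R, Rp) if a != b)
    return pts, R, Rp, cut
```

Calling `demo()` returns the number of points for which $R_j' \ne R_j$; by the argument above these can only be points within distance $R_j$ of a cell boundary.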


For the next step, put $B_0 = C_N^c$, and denote

$P(X \in B_\ell) = p_\ell$,  $\ell = 0,1,...,L$

so $\sum_\ell p_\ell = 1$. (Assume that for every $\ell$, $p_\ell > 0$, otherwise delete $B_\ell$.) Let

$n_\ell = \sum_j I(X_j \in B_\ell)$

so the $(n_0,...,n_L)$ have a multinomial distribution with parameters $(p_0,...,p_L)$. Consider the following construction: draw numbers $n_0,...,n_L$, $\sum_\ell n_\ell = n$, from a multinomial distribution with parameters $(p_0,...,p_L)$. Then put $n_\ell$ points $X_i^{(\ell)}$, $i = 1,...,n_\ell$, into $B_\ell$ using the distribution

$F_\ell(dx) = P(X \in dx \mid X \in B_\ell)$.

Denote by $P_\ell$ the joint distribution of $X_i^{(\ell)}$, $i = 1,...,n_\ell$, let $R_i^{(\ell)}$ be the nearest neighbor distance to $X_i^{(\ell)}$ from the other points in $B_\ell$, and $D_i^{(\ell)} = n^{1/m} R_i^{(\ell)}$. Put

$T_\ell = \sum_{i=1}^{n_\ell} \tilde h(X_i^{(\ell)}, D_i^{(\ell)})$.


,  "ill 


4,  =  i;=,  MXj.oj) 
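The two-stage construction (multinomial cell counts first, then points placed independently within each cell) can be mimicked as follows. This is a minimal sketch assuming a uniform density on the unit cube and congruent cubic cells, so that every $p_\ell = k^{-m}$ and the conditional law within a cell is uniform; the function name `two_stage_sample` is hypothetical.

```python
import itertools
import random
from collections import Counter

def two_stage_sample(n, k, m=2, seed=0):
    """Draw cell counts from a multinomial law, then fill each cell independently.

    Assumes a uniform density on [0,1]^m split into k^m congruent subcubes,
    so every cell has probability k^(-m).
    """
    rng = random.Random(seed)
    cells = list(itertools.product(range(k), repeat=m))
    weights = [1.0 / len(cells)] * len(cells)
    # Stage 1: n independent cell labels give multinomial counts (n_1, ..., n_L).
    counts = Counter(rng.choices(cells, weights=weights, k=n))
    # Stage 2: put n_l points into cell l using the conditional (here uniform) law.
    sample = {}
    for cell in cells:
        sample[cell] = [
            tuple((i + rng.random()) / k for i in cell)
            for _ in range(counts[cell])
        ]
    return sample
```

Given the counts, the point sets in different cells are independent, which is the property the later propositions exploit.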

Proposition 4.6: There are constants $\gamma_{n,\ell}$, $\ell = 1,...,L$, such that $\gamma_{n,\ell} \to \gamma_\ell$ and

$E(E(T_\ell \mid n_\ell) - E T_\ell - \gamma_{n,\ell}(n_\ell - n p_\ell))^2 \le C(\ell) < \infty$

where $C(\ell)$ is independent of n.


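Proof: Define

$W_\ell(r \mid x, n_\ell) = P(D_1^{(\ell)} > r \mid X_1^{(\ell)} = x) = [1 - F_\ell(S(x, r n^{-1/m}))]^{n_\ell - 1}$.

Note that

$E(T_\ell \mid n_\ell) = n_\ell \int h(x,r)\, W_\ell(dr \mid x, n_\ell)\, F_\ell(dx)$.

Define

$\chi_n(r \mid x) = W_\ell(r \mid x, n p_\ell) = [1 - F_\ell(S(x, r n^{-1/m}))]^{n p_\ell - 1}$

and, suppressing the dependence on $\ell$, let

$W_\ell(r \mid x, n_\ell) = \chi_n^{1+\mu_n}$,

that is, $1 + \mu_n = (n_\ell - 1)/(n p_\ell - 1)$. Then

$W_\ell(dr \mid x, n_\ell) = (1+\mu_n)\chi_n^{\mu_n} \chi_n(dr \mid x) = (1+\mu_n)\chi_n^{\mu_n}\, d\chi_n$

where $d\chi_n = \chi_n(dr \mid x)$. This is zero for $\mu_n = -1$, so we eliminate this set in the expectations to follow. Writing $n_\ell = (n p_\ell - 1)\mu_n + n p_\ell$ leads to the expression

(4.7)  $E(T_\ell \mid n_\ell) = n p_\ell (1+\mu_n)^2 \int h \chi_n^{\mu_n}\, d\chi_n\, dP_\ell - \mu_n(1+\mu_n) \int h \chi_n^{\mu_n}\, d\chi_n\, dP_\ell$.

The expectation of the square of the second term in (4.7) above is bounded by $c_\ell \|h\|^2 / n$, and is henceforth ignored.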

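Next, expand

$\chi_n^{\mu_n} = 1 + \mu_n \log \chi_n + \frac{\mu_n^2}{2} (\log \chi_n)^2 \chi_n^{\theta \mu_n}$,

where $0 \le \theta \le 1$, and substitute into the first term of (4.7). We assert that all terms containing a power of $\mu_n$ higher than one have squares whose expectations are uniformly bounded in n. For example

$(n p_\ell)^2 E(\mu_n^2 \int h (\log \chi_n)\, d\chi_n\, dP_\ell)^2 \le (n p_\ell)^2 \|h\|^2 E\mu_n^4 \le c \|h\|^2 (1 - p_\ell)^2$

and

$(n p_\ell)^2 E(\mu_n^2 (1+\mu_n)^2 \int h (\log \chi_n)^2 \chi_n^{\theta\mu_n}\, d\chi_n\, dP_\ell)^2$

$\le \|h\|^2 (n p_\ell)^2 E[\mu_n^4 (1+\mu_n)^4 (\int (\log \chi_n)^2 \chi_n^{\theta\mu_n}\, d\chi_n\, dP_\ell)^2]$

$\le 2\|h\|^2 (n p_\ell)^2 [E\{\mu_n^4 (1+\mu_n)^{-2};\ -1 < \mu_n < 0\} + E\{\mu_n^4 (1+\mu_n)^4;\ \mu_n > 0\}]$

$\le c_\ell \|h\|^2$.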

Therefore

(4.8)  $E(T_\ell \mid n_\ell) = n p_\ell \int h (1 + \mu_n (2 + \log \chi_n))\, d\chi_n\, dP_\ell + O_2(1)$

so

(4.9)  $E(T_\ell \mid n_\ell) - E T_\ell = n p_\ell \mu_n \int h (2 + \log \chi_n)\, d\chi_n\, dP_\ell + O_2(1)$

where $O_2(1)$ in (4.8) and (4.9) denote quantities such that $\sup_n E(O_2(1))^2 < \infty$. Let the $\gamma_{n,\ell}$ of the proposition be defined by

$\gamma_{n,\ell} = \int h (2 + \log \chi_n)\, d\chi_n\, dP_\ell$.

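The proof will be completed by showing that the integral on the right above converges. For x fixed, $\chi_n(r \mid x)$ is a non-increasing function of r such that for $x \in \mathrm{Int}(B_\ell)$

$\chi_n(r \mid x) \to e^{-f(x)V(r)} = \chi_0(r \mid x)$.

Since h(x,r) is a.s. continuous with respect to $d\chi_0\, dP_\ell$, then

$\int h\, d\chi_n\, dP_\ell \to \int h\, d\chi_0\, dP_\ell$.

Now let

$\tilde\chi_n(r \mid x) = (1 - \log \chi_n(r \mid x))\, \chi_n(r \mid x)$

so that

$\tilde\chi_n(dr \mid x) = -(\log \chi_n(r \mid x))\, \chi_n(dr \mid x)$.

For $x \in \mathrm{Int}(B_\ell)$

$\tilde\chi_n(r \mid x) \to (1 + f(x)V(r))\, e^{-f(x)V(r)} = \tilde\chi_0(r \mid x)$

and so

(4.10)  $\int h (\log \chi_n)\, d\chi_n\, dP_\ell \to -\int h\, d\tilde\chi_0\, dP_\ell$.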
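The two limits used in this proof, and the differential identity behind the auxiliary function built from $(1 - \log \chi)\chi$, can be checked numerically. The sketch below assumes a locally constant density f near x, so that $F_\ell(S(x, r n^{-1/m}))$ is replaced by $f V(r) / (n p_\ell)$; this simplification and the function names are assumptions made for illustration only.

```python
import math

def chi_n(v, n, p, f):
    """[1 - f*v/(n*p)]^(n*p - 1), with v = V(r) and a locally constant density f."""
    return (1.0 - f * v / (n * p)) ** (n * p - 1.0)

def chi_0(v, f):
    """Limiting survival function exp(-f * V(r))."""
    return math.exp(-f * v)

def chi_tilde(chi):
    """The auxiliary function (1 - log chi) * chi."""
    return (1.0 - math.log(chi)) * chi

def derivative_check(c, step=1e-5):
    """Central difference of chi_tilde at c, to compare with -log(c)."""
    return (chi_tilde(c + step) - chi_tilde(c - step)) / (2.0 * step)
```

For large n, `chi_n(v, n, p, f)` is close to `chi_0(v, f)`, and `derivative_check(c)` is close to `-math.log(c)`, which is the identity $d[(1 - \log \chi)\chi] = -(\log \chi)\, d\chi$.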
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Proposition  4.11:  ^  ^ 

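$n^{-1/2} \sum_\ell [E(T_\ell \mid n_\ell) - E T_\ell] \to_d N(0, \sigma^2_{N,L})$,

where

$\sigma^2_{N,L} = \sum_\ell p_\ell \gamma_\ell^2 - (\sum_\ell p_\ell \gamma_\ell)^2$.

Moreover, $n^{-1} E(\sum_\ell [E(T_\ell \mid n_\ell) - E(T_\ell)])^2 \to \sigma^2_{N,L}$.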


Proof:  Clear  from  the  preceding  proposition. 


It  is  useful  to  recall  the  dependence  of  parameters  on  N  and  L  at  this 


point. 


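Proposition 4.12: Let

(4.13)  $U_n = n^{-1/2} \sum_\ell [T_\ell - E(T_\ell \mid n_\ell)]$.

Then there is a constant $s^2_{N,L} < \infty$ such that

$E(U_n^2 \mid n_1,...,n_L) \to s^2_{N,L}$.

Proof: Given $n = (n_1,...,n_L)$, the terms in the sum for $U_n$ are independent.

$E(U_n^2 \mid n_1,...,n_L) = \frac{1}{n} \sum_\ell \mathrm{Var}(T_\ell \mid n_\ell)$,

$\mathrm{Var}(T_\ell \mid n_\ell) = n_\ell \mathrm{Var}(\tilde h(X_1^{(\ell)}, D_1^{(\ell)}) \mid n_\ell) + n_\ell (n_\ell - 1) \mathrm{Cov}(\tilde h(X_1^{(\ell)}, D_1^{(\ell)}), \tilde h(X_2^{(\ell)}, D_2^{(\ell)}) \mid n_\ell)$.

It is then sufficient to show that

$\mathrm{Var}(\tilde h(X_1^{(\ell)}, D_1^{(\ell)}) \mid n_\ell) \to$ constant,

$n\, \mathrm{Cov}(\tilde h(X_1^{(\ell)}, D_1^{(\ell)}), \tilde h(X_2^{(\ell)}, D_2^{(\ell)}) \mid n_\ell) \to$ constant.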
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This  result  can  be  gotten  through  a  simple  modification  of  propositions  3.7 
and  3.10. 

Now we are ready for the final steps. We can write

(4.14)  $Z_n'(N,L) =_D U_n + V_n$,

with $U_n$ defined in (4.13) and

$V_n = n^{-1/2} \sum_\ell [E(T_\ell \mid n_\ell) - E T_\ell]$.

By $=_D$ we mean equality in distribution when $U_n$ and $V_n$ have the joint distribution we have implicitly given them. Denote $e_N = P(X \in E_N^c)$.

Proposition 4.15: If $\sigma^2 = \lim_n \mathrm{Var}(Z_n)$, then

Proof: By propositions 4.4 and 4.5

(4.16)  $\limsup_n E(Z_n - Z_n'(N,L))^2 \le c\, e_N$.

Use the inequality

(4.17)  $|E Z_n^2 - E Z_n'^2(N,L)| \le E|Z_n - Z_n'(N,L)|^2 + 2\sqrt{E(Z_n)^2\, E(Z_n - Z_n'(N,L))^2}$

and take $n \to \infty$ to get the result.


Proposition 4.18: Let $\alpha = \sqrt{\max_\ell p_\ell}$, and take $|t|^3 \le \alpha^{-1}$. Note that $\alpha$ depends on both N and L. Let $g_n(t;N,L)$ denote the characteristic function of $Z_n'(N,L)$. Then

$\limsup_n |g_n(t;N,L) - e^{-(s^2_{N,L} + \sigma^2_{N,L}) t^2/2}| \le c\,\alpha |t|^3$.
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Proof: 


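$g_n(t;N,L) = E e^{it(U_n + V_n)} = E(e^{itV_n} E(e^{itU_n} \mid n))$,  $n = (n_0,...,n_L)$.

Given n, $U_n = \sum_\ell A_\ell$, with the $A_\ell$ independent and having the conditional distribution of $n^{-1/2}(T_\ell - E(T_\ell \mid n_\ell))$ given $n_\ell$. Hence

$E(e^{itU_n} \mid n) = \prod_\ell f_\ell(t)$,  $f_\ell(t) = E(e^{itA_\ell} \mid n_\ell)$.

Applying corollary 2.3 to $A_\ell$,

$E(A_\ell^2 \mid n_\ell) \le c_1 (n_\ell/n)$,  $E(|A_\ell|^3 \mid n_\ell) \le c_2 (n_\ell/n)^{3/2}$

where $c_k$ will denote constants depending only on m, $\|h\|$, and $\theta_k$ will be quantities such that $|\theta_k| \le 1$. Then

$|1 - f_\ell(t)| \le \frac{t^2}{2} E(A_\ell^2 \mid n_\ell) \le (c_1/2)\, t^2 (n_\ell/n)$

$|f_\ell(t) - 1 + \frac{t^2}{2} E(A_\ell^2 \mid n_\ell)| \le c_2 |t|^3 (n_\ell/n)^{3/2}$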

•  ^  1^1/^  ^  ^  -l/2;0  Ha-Pi*  aa 


Temporarily restrict t to the range $|t|\alpha \le c_1^{-1/2}/2$. Define

$B_n = \{\max_\ell (n_\ell/n) \le 2 \max_\ell p_\ell\}$.

On $B_n$, $|1 - f_\ell(t)| \le 1/4$, hence

$\log f_\ell(t) = \log[1 - (1 - f_\ell(t))] = -\frac{t^2}{2} E(A_\ell^2 \mid n_\ell) + \theta_1 c_2 |t|^3 (n_\ell/n)^{3/2} + \theta_2 c_3 t^4 (n_\ell/n)^2$

$\prod_\ell f_\ell(t) = \exp(-\frac{t^2}{2} \sum_\ell E(A_\ell^2 \mid n_\ell) + \Delta_n)$

where, since $|t|\alpha \le 1$,

$|\Delta_n| \le c_2 |t|^3 \alpha + c_3 t^4 \alpha^2 \le c_4 |t|^3 \alpha$.

Therefore

$|e^{\Delta_n} - 1| \le c_5 |t|^3 \alpha$

and so, denoting $\beta_n^2 = E(U_n^2 \mid n)$,

$|\prod_\ell f_\ell(t) - e^{-\beta_n^2 t^2/2}| \le c_5 |t|^3 \alpha$

holds on $B_n$ for all t such that $|t|^3 \le \alpha^{-1}$ and $|t|\alpha \le c_1^{-1/2}/2$. Write

$g_n(t;N,L) = E(I(B_n) e^{it(U_n + V_n)}) + E(I(B_n^c) e^{it(U_n + V_n)})$.

Since $P(B_n^c) \to 0$, the second term goes to zero, so

$\limsup_n |g_n(t;N,L) - E e^{itV_n - \beta_n^2 t^2/2}| \le c_5 |t|^3 \alpha$.

Combining this with propositions 4.11 and 4.12

$\limsup_n |g_n(t;N,L) - e^{-(s^2_{N,L} + \sigma^2_{N,L}) t^2/2}| \le c_5 |t|^3 \alpha$.

To complete the proof we need only remove the restriction $|t|\alpha \le c_1^{-1/2}/2$. But this can clearly be done by increasing the constant $c_5$.

The stage is now set for the proof of Theorem 4.1. By (4.16)

$\limsup_n |g_n(t) - g_n(t;N,L)| \le \limsup_n E|\exp\{it(Z_n - Z_n'(N,L))\} - 1| \le |t| \sqrt{c\, e_N}$,

where $g_n(t)$ is the characteristic function of $Z_n$. So, by proposition 4.18,

(4.19)  $\limsup_n |g_n(t) - \exp\{-(s^2_{N,L} + \sigma^2_{N,L}) t^2/2\}| \le c(|t| \sqrt{e_N} + |t|^3 \alpha)$

for $|t|^3 \alpha \le 1$. Now let $N \to \infty$, $L \to \infty$ in such a way that $\alpha \to 0$ and $e_N \to 0$. By proposition 4.15, if $e_N \to 0$, uniformly in L,

$s^2_{N,L} + \sigma^2_{N,L} \to \sigma^2$.

Since the restriction $|t|^3 \alpha \le 1$ is satisfied eventually for any fixed t, as $\alpha \to 0$ we conclude that, for all t,

$\lim_n g_n(t) = e^{-\sigma^2 t^2/2}$

and (4.1) follows since the equality of $\sigma^2$ and $\sigma^2(h)$ is derived from the moment convergence theorem 3.5.

By  considering  linear  combinations  of  h's  it  is  clear  how  the  results 
can  be  generalized  to  provide  a  multidimensional  central  limit  theorem,  and 
the  moment  convergence  theorem  3.5  can  be  easily  modified  to  give  the 
limiting form of the covariance matrix.

5.  The Process H(t) and Goodness-of-Fit

First, a Glivenko-Cantelli type theorem is established for H(t).


(5.1) 


X(x)  = 


;  g(x)  >  0 


*  ;  g(x)  =  0 


and  define  a  d.f.  H  by. 


(5.2) 


0  <  t  <  1 


,  t  >  1 


(5.3)  $a = H(1) - H(1-) = P[g(X_1) = 0]$.

Note that if f = g, then a = 0 and H is the d.f. of the uniform distribution.

Theorem 5.4: If A(iii) holds, as $n \to \infty$,

(5.5)  $\sup_y |\hat H(y) - H(y)| \to 0$ a.s.

Proof: We begin by showing,

(5.6)  $\hat H(y) \to H(y)$ a.s. for all $0 \le y < 1$

(5.7)  $\hat H(1-) \to 1 - a = H(1-)$, a.s.

To prove (5.6) note that by corollary 2.3,

$P[|\hat H(y) - E\hat H(y)| \ge \varepsilon] = O(n^{-2})$

and hence by the Borel-Cantelli lemma,

(5.8)  $\hat H(y) - E\hat H(y) \to 0$ a.s. for all $0 < y < 1$.

Assertion (5.6) then follows by using (3.7) to show that $E\hat H(y) \to H(y)$. Next (5.7) is an immediate consequence of the S.L.L.N. To complete the proof of the theorem, let

(5.9)  $\hat H^*(y) = \hat H(y)/\hat H(1-)$ for $0 \le y < 1$,  $\hat H^*(y) = 1$ for $y \ge 1$

and define $H^*$ similarly in relation to H. By (5.6) and (5.7) $\hat H^*$ converges in law to $H^*$ with probability 1. But $H^*$ is continuous and hence by Polya's theorem,

(5.10)  $\sup_y |\hat H^*(y) - H^*(y)| \to 0$

and (5.5) follows from (5.10) and (5.7).


Define a stochastic process on [0,1] by

(5.11)  $Z_n(t) = \sqrt{n}\,(\hat H(t) - E\hat H(t))$,  $0 \le t \le 1$

and a corresponding Gaussian process Z with mean 0 whose covariance function $\gamma(s,t)$, $s \le t$, is defined by

(5.12) 

y(s,c)  =J 

-  (log  s_ 

Iks'-f 

/.Y 

+  log  tf> 

r 

log  s  log 

log  sJa( 

st)^f 

*/>,(: 

st)""|(7^ 

B(3,t) 

\ 

'^e  write  A, 

f  for  a(x) , 

f(x)  etc 

.) 

wners 


3(s,t)  =  jw;  Ilwil  £ 


:g  r.3,:,wj  = 


I 


dz 


S(0,r,)  n  3(w,r,) 


where 

y(r^)  =  -log  s 
'^(r^)  =  -log  t 

\ 

f=g,  then  Y(s,t),  s  <  t,  reduces  to 
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(SJS)  ''(s,t)  =  3  -  3t(T  +  log  t  log  s  log  t)  -  3tj'(7(s ,t ,w} -1  )dw 

8(s ,t) 

Clearly the processes $Z_n(\cdot)$ can be identified with probability measures on D[0,1] and it will follow as a consequence of our proof that $Z(\cdot)$ can be as well. In fact, if a = 0, $Z(\cdot)$ has a.s. continuous sample functions. Our main result is,

Theorem 5.14: Suppose that A and B hold. Then,

$Z_n \to Z$

in the sense of weak convergence in D[0,1], where Z is as above and has a.s. continuous sample functions.

Before  giving  the  proof  we  state  and  prove  the  corollary  of  greatest 
interest  to  us. 

Let

$S_0 = n \int_0^1 (\hat H(t) - E\hat H(t))^2\, dt$

t)  -  EH(t)]-  dH(t'  =  I  -  7' 

Corollary 5.15: If f = g and A holds, both $S_0$ and $S_1$ tend in law to

$\int_0^1 Z^2(t)\, dt$, where Z has covariance function (5.13).

The corollary is, for $S_0$, an immediate consequence of Theorem 5.2. By writing

$S_1 = \int_0^1 Z_n^2(\hat H^{-1}(t))\, dt$

we see that the corollary follows in this case from Theorems 5.1 and 5.2.

Notes:  1)  The  theorem  can  be  extended  to  the  case  a  >  0  by  a  conditioning 
argument  as  in  Section  2.  Of  course  the  Z  process  is  then  continuous  only 
on  [0,1)  and  has  a  jump  at  1 . 

2) It is not possible in Theorem 5.1 to replace $E\hat H$ in the definition of $Z_n$ by H. Although $E\hat H(t) \to H(t)$, the difference is of the order of $n^{-2/m}$ and will not be negligible for m > 3.

Proof of Theorem 5.14: We begin by establishing the tightness of the $Z_n$ sequence using the 4th moment bound proven in Section 2. Let $R_1,...,R_n$ be as in Section 2 and recall that


1 

. "  ■ 

Lemma 5.16: If A(iii) and B hold, the sequence of processes $\{Z_n\}$ is tight in D[0,1] and any weak limit point is in C[0,1].
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Proof:  We  use  a  device  due  to  Shorack  (1973). 
Note  that; 


1 

T 


<  ^^1^)  ■  P(9(X,;)0?  < 


where  is  t.he  volume  of  the  unit  sphere  in  Let 


Q„(t) 


'-’°9  t 


m 


where  is  given  in  corollary  2.5.  Note  that  by  B  and  the  dominated 
convergence  theorem  is  continuous.  For  given  5  >  0,  let  t^<...<tj^  be 
such  that. 


where 

Let 


g  (t.)  *  ^ 

V  VfT 


1  <  i  <  K 


1  <  (K+1) 


5 

w 


VrT 


Note  that 


for  tj  <  £  <  .  0  <  1  <  K.  .  0,  -  I 


Z„*(0)  .  Z„*{1 )  .  0 


An elementary application of corollary 2.5 shows that,

(5.17)  $E(Z_n^*(t) - Z_n^*(s))^4 \le M (Q_n(t) - Q_n(s))^2$  for all s, t,

where M depends on $\delta$ but is independent of n. Since, under A(iii) and B, dominated convergence implies that for each y,

Snly) 


►/  f(x)|l-exp 


-1 

!T 


dx 


a continuous probability distribution; it follows from a slight modification of Billingsley ((1968), Theorems 12.3 and 12.4) that $\{Z_n^*\}$ is tight and that all limit points of $\{Z_n^*\}$ are in C[0,1].

Next  note  that 


(5.18)  sup^!Z^(t)-Z^(t)i  <  max  sup{|Z^(t)-Z^(t.)|  :  t^.  <  t  < 


(sup.:|Q„(t)-q„(t,)|:  t,  <  t  <  t,„))|Z„(t,*,)-Z„(t,)|:  0  <  1  <  k) 


<  max  j  [|Z^(t,,.,)-Z^(tj)|  + 

*  ^  01' 


using  the  monotonicity  ofHj^(*). 

Next  note  that  integrating  (2.8)  for  j=0,  implies  that  for  C  independent  of  n, 

Vn-£(S„{t,„)-5„(t,))  <  Cyn(Q„(t,„)-0„(t,))iCo  . 

Hence, 

(5.19)  sup^|Z^(t)-Z*(t)|  <  2  max{jZ*(t^.,T)-z’(t^)!  :  0  <  i  <  K}  +  Co 

But in view of (5.17), some elementary inequalities give
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(5.20)  P[max{  !Z*(t.^^)-Z*(t.)l:  0  ±K}  >z] 


0  . 


By (5.18)-(5.20), for each $\delta > 0$, with C independent of $\delta$,

(5.21)  $P[\sup_t |Z_n(t) - Z_n^*(t)| > 2C\delta] \to 0$.

Since $\{Z_n^*\}$ is tight for each $\delta$, (5.21) implies tightness of $\{Z_n\}$ and a.s. continuity of all limit points. (See, for example, Theorem 4.2 of Billingsley (1968). Note that the dependence of $Z_n^*$ on $\delta$ is immaterial.)

Asymptotic normality of $(Z_n(t_1),...,Z_n(t_k))$ follows from the representation given in the introduction,

$Z_n(t) = n^{-1/2} \sum_{i=1}^{n} h^*(X_i, D_i)$

with

$h(x,d) = I(\exp\{-g(x)V(d)\} \le t)$

and the multivariate extension of theorem 4.1. Similarly the formulae (5.11) and (5.12) for $\gamma(s,t)$ may be obtained after tedious calculations from the appropriate straightforward generalizations of proposition 3.10.
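The representation above is built from the transformed distances $\exp\{-g(X_i)V(D_i)\}$ with $D_i = n^{1/m} R_i$. A minimal sketch of how these quantities and their empirical distribution function might be computed follows; it assumes m = 2 and the Euclidean norm, so that $V(D_i) = n \pi R_i^2$, and the names `w_statistics` and `empirical_df` are hypothetical. It does not compute $E\hat H$, which the test statistics also require.

```python
import math
import random

def w_statistics(points, g):
    """W_i = exp(-n * g(X_i) * pi * R_i^2) for planar points (Euclidean norm)."""
    n = len(points)
    out = []
    for j, p in enumerate(points):
        # Brute-force nearest neighbor distance R_j.
        r = min(math.dist(p, q) for i, q in enumerate(points) if i != j)
        out.append(math.exp(-n * g(p) * math.pi * r * r))
    return out

def empirical_df(values, t):
    """Empirical distribution function of the W_i evaluated at t."""
    return sum(1 for v in values if v <= t) / len(values)

def uniform_square_sample(n, seed=0):
    """n points drawn uniformly from the unit square (the case f = g = 1)."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]
```

When the hypothesized density g equals the true density, the $W_i$ should look roughly uniform on (0,1), up to boundary effects that this sketch ignores.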


As an immediate consequence of theorem 5.4 and corollary 5.15 we have

Theorem 5.22: The tests which reject when $S_1 \ge c(\alpha)$, where

$P\{\int_0^1 Z^2(t)\,dt \ge c(\alpha)\} = \alpha$,

asymptotically have level $\alpha$ for H: f = g and are consistent against all $f \ne g$ which satisfy A and B.




Note :  In  his  thesis  M.  Schilling  (1979)  has  made  a  far  reaching  investigation 
of  the  power  of  this  and  related  tests  against  contiguous  alternatives,  has 
constructed  tables  of  the  asymptotic  null  distribution  of  Sq  for  m  =  1  and 
=0  and  has  studied  the  efficiency  of  the  large  m  and  n  approximation  through 


simulation. 
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APPENDIX 

In  this  appendix  we  give  the  statements  and  proofs  of  several  lemmas 
of  a  technical  or  computational  nature  which  are  used  in  the  previous 
sections.  We  begin  with  a  key  lemma  due  to  Stone  (1977). 

Lemma S: For each m and norm $\|\cdot\|$ there exists $\alpha(m) < \infty$ such that it is possible to write $R^m$ as the union of $\alpha(m)$ disjoint cones $C_1,...,C_{\alpha(m)}$ with 0 as their common peak such that if

$x, y \in C_j$, $x, y \ne 0$, then $\|x - y\| < \max(\|x\|, \|y\|)$,  $j = 1,...,\alpha(m)$.

The following straightforward modification of Stone's argument shows that the lemma is valid for any norm.

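Proof: By compactness of the surface of the unit sphere $\partial S(0,1)$ we can find disjoint sets $\tilde C_1,...,\tilde C_{\alpha(m)}$ such that,

(i)  $\bigcup_j \tilde C_j = \partial S(0,1)$

(ii)  $\tilde x, \tilde y \in \tilde C_j \Rightarrow \|\tilde x - \tilde y\| < 1$.

Let

$C_j = \{\lambda \tilde x : \tilde x \in \tilde C_j,\ \lambda \ge 0\}$,  $j = 1,...,\alpha(m)$.

Suppose $x = \lambda \tilde x$, $y = \gamma \tilde y$, $\tilde x, \tilde y \in \tilde C_j$. Suppose w.l.o.g. $\lambda \le \gamma$. Then,

$\|x - y\| \le (\gamma - \lambda)\|\tilde y\| + \lambda \|\tilde x - \tilde y\| < \|y\|$.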

The  following  are  easy  corollaries  of  lemma  S. 

Corollary S1: For any set of n distinct points $x_1,...,x_n$ in $R^m$, $x_1$ can be the nearest neighbor of at most $\alpha(m)$ points.
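Corollary S1 bounds the in-degree of the nearest neighbor graph. The sketch below counts, for a planar sample, how many points have a given point as nearest neighbor. The comparison value 6 is the classical bound for the Euclidean plane and is used here only as an illustration; the paper does not state the value of $\alpha(m)$, and the function names are hypothetical.

```python
import math
import random
from collections import Counter

def nearest_neighbor_index(points):
    """Index of the Euclidean nearest neighbor of every point (brute force)."""
    nn = []
    for j, p in enumerate(points):
        candidates = (i for i in range(len(points)) if i != j)
        nn.append(min(candidates, key=lambda i: math.dist(p, points[i])))
    return nn

def max_in_degree(points):
    """Largest number of points sharing the same nearest neighbor."""
    counts = Counter(nearest_neighbor_index(points))
    return max(counts.values())

def planar_sample(n, seed=0):
    """n points drawn uniformly from the unit square."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]
```

For any planar sample, `max_in_degree(planar_sample(n))` should not exceed 6, whatever n is, which is the kind of sample-size-free bound the corollary provides.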

Corollary S2: If $C_1,...,C_{\alpha(m)}$ are as in lemma S, $y_0$ is arbitrary, and $x \in C_j + y_0$, then

$S(x, \|x - y_0\|) \supset S(y_0, \|x - y_0\|) \cap (C_j + y_0)$.

The  following  consequence  of  S2  is  needed  for  the  proof  of  lemma  A2  but 
is  of  independent  interest. 

Theorem A1: Let Y be a random m vector with distribution G, density g, and let $y_0$ be a fixed point,

$Q = G(S(Y, \|Y - y_0\|))$.

Then,

(A.2)  $P[Q \le q] \le \alpha(m)\, q$,  $0 \le q \le 1$.

Proof: First let $y_0 = 0$ and let $G_j$ be the conditional distribution of Y given $Y \in C_j$ and $p_j = G(C_j)$, where the $C_j$ are given by corollary S2. Then,

(A.3)  $P[Q \le q] = \sum_j \{p_j P[Q \le q \mid Y \in C_j] : p_j > 0\}$.

But $Y \in C_j$ implies by corollary S2 that

$G(S(Y, \|Y\|)) \ge p_j G_j(S(0, \|Y\|) \cap C_j)$.

Hence, for $p_j > 0$,

(A.4)  $P[Q \le q \mid Y \in C_j] \le P[G_j(S(0, \|Y\|)) \le q/p_j \mid Y \in C_j] \le q/p_j$

since, given $Y \in C_j$, $G_j(S(0, \|Y\|))$ has a uniform distribution on (0,1). (A.3) and (A.4) imply (A.2) if $y_0 = 0$. For the general case shift everything by $y_0$ and apply corollary S2 in full generality.
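A one-dimensional illustration of the bound (A.2) may help. The sketch assumes m = 1, G uniform on [0,1] and $y_0 = 1/2$; in one dimension the two half-lines serve as cones, so the bound reads $P[Q \le q] \le 2q$. These choices and the function names are assumptions made for illustration only.

```python
def q_value(y, y0=0.5):
    """Q = G(S(Y, |Y - y0|)) for G uniform on [0,1]: length of [y-d, y+d] within [0,1]."""
    d = abs(y - y0)
    return min(1.0, y + d) - max(0.0, y - d)

def cdf_of_q(q, grid=100000):
    """P[Q <= q], approximated by a midpoint rule over Y uniform on [0,1]."""
    hits = sum(1 for i in range(grid) if q_value((i + 0.5) / grid) <= q)
    return hits / grid
```

In this example Q equals the smaller of $2|Y - y_0|$ and 1/2, so $P[Q \le q] = q$ for $q < 1/2$, comfortably inside the bound.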

Corollary A5: If Q is as in theorem A.1, $r \ge 0$,

$E(1-Q)^r Q \le M(r+1)^{-2}$

where M depends only on m.

Proof: Since $0 \le Q \le 1$ we may w.l.o.g. take $r \ge 2$. By integration by parts

$E(1-Q)^r Q = \int_0^1 P[Q \le q] \{-(1-q)^r + r q (1-q)^{r-1}\}\, dq$

$\le \alpha(m)\, r \int_0^1 q^2 (1-q)^{r-1}\, dq$

$\le r (r-1)^{-3} \alpha(m) \int_0^{r-1} w^2 (1 - \frac{w}{r-1})^{r-1}\, dw$

$\le 2\alpha(m)\, r (r-1)^{-3}$

$\le M(r+1)^{-2}$.
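The integral in the second line of this chain has the closed form $\int_0^1 q^2 (1-q)^{r-1}\, dq = 2/(r(r+1)(r+2))$, a Beta integral, so the bound $2 r (r-1)^{-3}$ on $r$ times the integral can be checked directly. The sketch below does this numerically with a hand-written Simpson rule; the helper names are hypothetical and the closed form is a standard identity rather than something stated in the paper.

```python
def simpson(func, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def beta_integral(r):
    """Closed form of the integral of q^2 (1-q)^(r-1) over [0,1], i.e. B(3, r)."""
    return 2.0 / (r * (r + 1) * (r + 2))

def integrand(r):
    """The function q -> q^2 (1-q)^(r-1)."""
    return lambda q: q * q * (1.0 - q) ** (r - 1)

def bound_holds(r):
    """Check r * integral <= 2 r (r-1)^(-3) for r >= 2."""
    return r * beta_integral(r) <= 2.0 * r * (r - 1) ** -3
```

Since $r(r+1)(r+2) \ge (r-1)^3$ for $r \ge 2$, `bound_holds(r)` is true throughout that range, in line with the last two steps of the proof.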


We proceed to lemmas A6 and A10.
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Lenma  A6:  Let 

=  [X,  ‘X,] 

f,2  •  CXi  •  X.,  R,  is,] 
f,3  •  [J,  =  2  or"j,  =  2] 

■t'lj]  x-i 

Proof:  All  these  estimates  follow  by  symmetry  arguments  as  in  the  proof 
of  lemma  2.27.  We  prove  one  of  the  estimates  of  (A. 8)  as  an  example. 

Note that we may without loss of generality take $r \le n/4$ (say). Then


^  +  f2(S)V 

n 


Then 
(A. 7) 

(A. 3) 


(A. 9)  PCF,.nF  ]  <  [(n-r)(n-r-l)]’''E[  I  KF..)  I  (I(J.  =k)  +I(J.  =k))] 

12  u  -  1^3-,  1  T 


n-r 


n-r 


<  8a(m)n"‘^E(N+r) 


by  corollary  SI.  But 

8a(m)n  ^E(N-^r)  ^ 


Clearly the bounds (A.7) and (A.8) are overestimates in this case. We have written the lemma in this way for compactness.
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Lemma  A10:  With  the  same  definitions  for  j  »  1,2, 

2 

(A. 11)  E  i<  +  f2(S)^ 

2 

(A. 12)  E  I(F^j.)  F(S^)  f2(S)^ 

2  \ 

(A. 13)  E  r(F^j)  F(S^)  P^(S)) 

Proof:  a)  j  =  1 

E  I(F„)  ^  F(S)  (l  *  (l  -^)f(S)) 
E  I(F^^)  F(St)  «  P(F^^)  EF(S^)  «  ^ 


Let 


Then, 


R*  =  mind  [X.  -  Xj|  I  :  1  <  j  <  n-r ,  j  f  i } 
E  I(Fit)F(St)  <  E  I(F^^)F(S(d,R*)) 


<  (n-r)"''  F(S)(1-F(S))  +  f2(S) 

The bounds (A.11)-(A.13) are immediate for $r \le n/4$ and trivial (for large enough M) for r > n/4.


E  I(F,2 


(’  ■  '’■^''12  ''21^  i 


tN^.iS+r) 

n(n-r-2) 


<  M  F(S)  +  F^(S)) 

for  r  £  n/4  and  (A. 11)  follows. 

To  prove  (A. 12)  begin  by  writing, 

(A. 14)  E  r(F^2^''  ^^1^  1  ^ 

+  E  I(X^  =  X^.R^q  >  Ric)P(St)  +  s'*  E  I(X^  «  X^.R^q 

j  ^ 


where, 

R^q  *  minCl IXj  -  X^  j 1  ^  *  Xj,  jfl ,  1  1  j  <  n-r} 

'  *  'nind  |Xj.  -  1 1  :  Xj  f  X^,  jfl ,  1  £  j  <  n-r} 

Then,  we  bound 

£  KX,  =  X,,R,  <  R,)F{St)  1  £  I(X.  H  X.  )F(S,)  =  n"'' 


(A. 15) 
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a,  % 

Given  can  apply  corollary  A.l  noting  that  F2(S(X^ , 1  j -x^ j  j ) ) 

has  the  distribution  of  Q  with  G  »  F-,  x.  =  y  Since  conditionally  K-l 

J  J  W 

has  a  binomial  (n-r-1,  l-F(S))  distribution,  we  obtain  as  a  bound  for  (A. 17), 
(A. 18)  Me(;<’2|x^»X^)  <  |M(l.F(S))’^(n-r)’2 

Therefore,  we  obtain 

(A.l 9)  ^  ^(s)) 

for  i  T  ’  -J 

Combining  (A. 15),  (A. 16)  and  (A. 17)  we  obtain  (A. 12)  for  j  = 2,  since  the 
restrictions  on  r  and  F  can  be  absorbed  into  M  for  the  final  bound. 

Finally, 

(A. 20)  E  I(Fi2)^1  <  R^)F(S^)  +  E  KX^  .Xj^  fXj^  )F(S(X^  ,R^q)  ) 

The first term in (A.20) has been bounded in (A.14) and (A.19). The second is bounded as in (A.15) by

F(S)  E(i!X^=X^)  <  M  F(S)  J  ,  F(S)  <  j  , 
r < n/4. (A.13) follows for j = 2 and the lemma is proved.
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